Failure-Resilient Computations in the EcliPSe System
Authors
Abstract
Computation systems built on local or wide-area connected workstation clusters are inherently failure-prone, particularly for long-running computations. In this work we introduce a variety of features for failure resilience in the EcliPSe system for replicative applications. Key characteristics of fault-tolerant EcliPSe are ease of use, low state-saving costs, system scalability, and good performance.
Similar resources
Fail-safe concurrency in the EcliPSe system
Local or wide-area heterogeneous workstation clusters are relatively cheap and highly effective, though inherently unstable operating environments for long-running distributed computations. We found this to be the case in early experiments with a prototype of the EcliPSe system, a software toolkit for replicative applications on heterogeneous workstation clusters. Hardware or network failures i...
Verification of Monitor unit calculations for eclipse Treatment Planning System by in-house developed spreadsheet
Introduction: Computerized treatment planning is a rapidly evolving modality that depends on hardware and software efficiency. Despite ICRU recommendations suggesting 5% deviation in dose delivery, the overall uncertainty should be less than 3.5%, as suggested by B.J. Mijnheer. In-house spreadsheets are developed by medical physicists to cross-verify the dose calculated by the Treatment Pl...
On the Effectiveness of Superconcurrent Computations on Heterogeneous Networks
Concurrent computing on networked collections of computer systems is rapidly evolving into a viable technology that is attractive from the economic, performance, and availability perspectives. Several software infrastructures that support such heterogeneous network-based concurrent computing have evolved, and are in use for production-quality high-performance computing. In this paper, we descri...
Rexsss Performance Analysis: Domain Decomposition Algorithm Implementations for Resilient Numerical Partial Differential Equation Solvers
The future of extreme-scale computing is expected to magnify the influence of soft faults as a source of inaccuracy or failure in solutions obtained from distributed parallel computations. The development of resilient computational tools represents an essential recourse for understanding the best methods for absorbing the impacts of soft faults without sacrificing solution accuracy. The Rexsss ...
Resilient Configuration of Distribution System versus False Data Injection Attacks Against State Estimation
State estimation is used in power systems to estimate grid variables based on meter measurements. Unfortunately, power grids are vulnerable to cyber-attacks. Reducing cyber-attacks against state estimation is necessary to ensure power system safe and reliable operation. False data injection (FDI) is a type of cyber-attack that tampers with measurements. This paper proposes network reconfigurati...